
                              ABOUT REVERSING
                              ===============

                                  ABSTRACT
                                  --------

     Reversing  of  executable files is the only base to write undetectable
 viruses.
     This is based on the following axiom: complexity C1 of detecting virus
 itself,  when  virus  location  is  given,  and  complexity  C2 of finding
 possible virus locations within infected objects, are different; and total
 complexity of detecting virus precence is a product of them, i.e. C1 * C2.
 Both  complexities are interrelated; and both are limited by the object to
 be  infected. This means that there exists some maximal complexity, which,
 when  reached, will divide object and virus into different parts. As such,
 our  task  is  to  build  optimal infection methods: when product of these
 complexities  will  be  maximal,  but  not  critically high, and thus only
 iteration-based detection methods will be effective.
     For  example.  Writing poly decryptor is good, but inserting it always
 into constant place, such as end of last section, is bad. Writing very big
 poly  decryptor is bad in any case. Putting plain virus, into any place of
 the  program,  even  into random place, is bad. So, the questions are: how
 much  should be the virus polymorphic; in how many ways may it be inserted
 into  file;  and,  because these two things are interrelated, where is the
 optimal combination.
     To  find  answers to these questions, virus must know everything about
 itself  and  about file to be infected. First part can be easily achieved;
 this  was  shown  us  in  lots of metamorphic viruses. Second part is much
 harder,  and  this  is  also the subject of this article: how can virus to
 find out more information about the file it want to infect.

                                BASE METHODS
                                ------------

     There  are  only  two  common  methods  i  can  see: disassembling and
 tracing, or, in other words, static and dynamic analysis.
     By  combining  these  methods it is possible to find such things about
 each useful instruction of the file, which were never imagined before.

                                SAMPLE FILE
                                -----------

     Sample file performs the following actions:

 1. REP CMPSB, to show how tracer will bypass similar instructions
 2. "Multi-exec" code, i.e. opcodes executed more than once, or in cycles,
    or code in subroutines called more than once
 3. Thread creation
 4. Exception generation and handling (SEH)
 5. "Multi-thread" code, i.e. opcodes executed in different threads

     Here  are  shown  a  program,  i.e. offsets and opcodes, which will be
 later used in disassembly/trace results:

; CODE section
00401000 BE 00 10 40 00    start:          mov     esi, offset start
00401005 8B FE                             mov     edi, esi
00401007 B9 00 10 00 00                    mov     ecx, 4096
0040100C F3 A6                             repe    cmpsb
0040100E B8 0A 00 00 00                    mov     eax, 10
00401013 48                multiexec:      dec     eax
00401014 75 FD                             jnz     short multiexec
00401016 E8 55 00 00 00                    call    multiexec_and_multithread
0040101B 50                                push    eax
0040101C 54                                push    esp
0040101D 6A 00                             push    0
0040101F 68 3F 10 40 00                    push    offset quit
00401024 68 46 10 40 00                    push    offset new_thread
00401029 6A 00                             push    0
0040102B 6A 00                             push    0
0040102D E8 4C 00 00 00                    call    CreateThread
00401032 58                                pop     eax
00401033 68 E8 03 00 00    halt:           push    1000
00401038 E8 3B 00 00 00                    call    Sleep
0040103D EB F4                             jmp     short halt
0040103F 6A FF             quit:           push    -1
00401041 E8 2C 00 00 00                    call    ExitProcess
00401046 55                new_thread:     push    ebp
00401047 E8 06 00 00 00                    call    seh_init
0040104C 8B 64 24 08       seh_handler:    mov     esp, [esp+8]
00401050 EB 15                             jmp     short seh_done
00401052 64 67 FF 36 00 00 seh_init:       push    dword ptr fs:[0]
00401058 64 67 89 26 00 00                 mov     fs:[0], esp
0040105E E8 0D 00 00 00                    call    multiexec_and_multithread
00401063 33 C0                             xor     eax, eax
00401065 F7 F0             exception:      div     eax
00401067 64 67 8F 06 00 00 seh_done:       pop     dword ptr fs:[0]
0040106D 58                                pop     eax
0040106E 5D                                pop     ebp
0040106F C3                                retn
00401070 90     multiexec_and_multithread: nop
00401071 C3                                retn
00401072 FF 25 38 30 40 00 ExitProcess:    jmp     ds:__imp_ExitProcess
00401078 FF 25 3C 30 40 00 Sleep:          jmp     ds:__imp_Sleep
0040107E FF 25 40 30 40 00 CreateThread:   jmp     ds:__imp_CreateThread
; .idata section
00403038 ?? ?? ?? ??                       extrn __imp_ExitProcess:dword
0040303C ?? ?? ?? ??                       extrn __imp_Sleep:dword
00403040 ?? ?? ?? ??                       extrn __imp_CreateThread:dword

                               DISASSEMBLING
                               -------------

     At  first,  if  we  will  sequentally  parse  the file, instruction by
 instruction,  and  collect  all  available info, the following list can be
 generated: (using Mistfall engine)

00401000: SECTALIGN
00401000: LABEL, DREF (flag=00000108)
00401000: OPCODE  BE
00401001: FIXUP 00001000
00401005: OPCODE  8B FE
00401007: OPCODE  B9 00 10 00 00
0040100C: OPCODE  F3 A6
0040100E: OPCODE  B8 0A 00 00 00
00401013: LABEL, CREF (flag=00000088)
00401013: OPCODE  48
00401014: OPCODE  75 FD
00401016: OPCODE  E8 55 00 00 00
0040101B: OPCODE  50
0040101C: OPCODE  54
0040101D: OPCODE  6A 00
0040101F: OPCODE  68
00401020: FIXUP 0000103F
00401024: OPCODE  68
00401025: FIXUP 00001046
00401029: OPCODE  6A 00
0040102B: OPCODE  6A 00
0040102D: OPCODE  E8 4C 00 00 00
00401032: OPCODE  58
00401033: LABEL, CREF (flag=00000088)
00401033: OPCODE  68 E8 03 00 00
00401038: OPCODE  E8 3B 00 00 00
0040103D: OPCODE  EB F4
0040103F: LABEL, DREF (flag=00000108)
0040103F: OPCODE  6A FF
00401041: OPCODE  E8 2C 00 00 00
00401046: LABEL, DREF (flag=00000108)
00401046: OPCODE  55
00401047: OPCODE  E8 06 00 00 00
0040104C: OPCODE  8B 64 24 08
00401050: OPCODE  EB 15
00401052: LABEL, CREF (flag=00000088)
00401052: OPCODE  64 67 FF 36 00 00
00401058: OPCODE  64 67 89 26 00 00
0040105E: OPCODE  E8 0D 00 00 00
00401063: OPCODE  33 C0
00401065: OPCODE  F7 F0
00401067: LABEL, CREF (flag=00000088)
00401067: OPCODE  64 67 8F 06 00 00
0040106D: OPCODE  58
0040106E: OPCODE  5D
0040106F: OPCODE  C3
00401070: LABEL, CREF (flag=00000088)
00401070: OPCODE  90
00401071: OPCODE  C3
00401072: LABEL, CREF (flag=00000088)
00401072: OPCODE  FF 25
00401074: FIXUP 00003038
00401078: LABEL, CREF (flag=00000088)
00401078: OPCODE  FF 25
0040107A: FIXUP 0000303C
0040107E: LABEL, CREF (flag=00000088)
0040107E: OPCODE  FF 25
00401080: FIXUP 00003040
00402000: SECTALIGN
00403000: SECTALIGN
00403038: LABEL, DREF (flag=00000108)
00403038: RVA 00003056
0040303C: LABEL, DREF (flag=00000108)
0040303C: RVA 00003064
00403040: LABEL, DREF (flag=00000108)
00403040: RVA 0000306C

     This  list  contains all the info (really it has been reduced) that we
 need  to rebuild the whole program. By means of adding new entries to such
 list  we  can  add  new  instructions to the program. By means of deleting
 entries,  we  can  remove  instructions  from the program. Because all the
 dependencies  between  instructions  are present in such list, it contains
 not relative arguments (in case of jmp/call-alike instructions), but links
 to  another list entries. This is like a fishing net, which were raised up
 by  millions  hands, after untied; and now you can command to each hand to
 do  whatever  you  want with corresponding element; and then, after net is
 put down and automatically tied, everything will be as before, except some
 changes you did.

     Such  list  is,  first, tool and method to work with executable files;
 and,  second,  it  is  enumerator  object,  which allow you to sequentally
 analyze each instruction.

                                  TRACING
                                  -------

     Tracing  (using  win32  api)  allows  us  to find order of instruction
 execution,  and some other things. Process is traced for few random amount
 of time, or until it does something wrong.
     The following things can be found out using tracing:

   1. If instruction is really executed
   2. Execution order, global (per process) and local (per thread)
   3. Instruction, previous to each one
   4. Cycles and subroutines, i.e. "multi-exec" code.
   5. Threads and "multi-thread" code.
   6. Exception addresses and some exception handlers.
   7. Variables which were modified after tracing
   8. Order and names of API functions called
   9. Everything alredy said above, but also for each DLL used
  10. Other modules and DLLs loaded by main executable,
      was it static load or LoadLibrary
      ...

     Here is a list, generated while tracing the sample code:

original   previous   thread     global    thread#     # of     M0  M1   flags
   va         va      index      index                 execs
00401000 p=77F9279F l=00005DEA g=00005DEA t=00000001 c=00000001 BE->BE [ R_OPCODE ]
00401005 p=00401000 l=00005DEB g=00005DEB t=00000001 c=00000001 8B->8B [ R_OPCODE ]
00401007 p=00401005 l=00005DEC g=00005DEC t=00000001 c=00000001 B9->B9 [ R_OPCODE ]
0040100C p=00401007 l=00005DED g=00005DED t=00000001 c=00000001 F3->F3 [ R_OPCODE ]
0040100E p=0040100C l=00005DEE g=00005DEE t=00000001 c=00000001 B8->B8 [ R_OPCODE ]
00401013 p=00401014 l=00005DEF g=00005DEF t=00000001 c=0000000A 48->48 [ R_OPCODE R_MULTIEXEC ]
00401014 p=00401013 l=00005DF0 g=00005DF0 t=00000001 c=0000000A 75->75 [ R_OPCODE R_MULTIEXEC ]
00401016 p=00401014 l=00005E03 g=00005E03 t=00000001 c=00000001 E8->E8 [ R_OPCODE ]
0040101B p=00401071 l=00005E06 g=00005E06 t=00000001 c=00000001 50->50 [ R_OPCODE ]
0040101C p=0040101B l=00005E07 g=00005E07 t=00000001 c=00000001 54->54 [ R_OPCODE ]
0040101D p=0040101C l=00005E08 g=00005E08 t=00000001 c=00000001 6A->6A [ R_OPCODE ]
0040101F p=0040101D l=00005E09 g=00005E09 t=00000001 c=00000001 68->68 [ R_OPCODE ]
00401024 p=0040101F l=00005E0A g=00005E0A t=00000001 c=00000001 68->68 [ R_OPCODE ]
00401029 p=00401024 l=00005E0B g=00005E0B t=00000001 c=00000001 6A->6A [ R_OPCODE ]
0040102B p=00401029 l=00005E0C g=00005E0C t=00000001 c=00000001 6A->6A [ R_OPCODE ]
0040102D p=0040102B l=00005E0D g=00005E0D t=00000001 c=00000001 E8->E8 [ R_OPCODE ]
00401032 p=77E9652D l=00005F6D g=00005F6D t=00000001 c=00000001 58->58 [ R_OPCODE ]
00401033 p=0040103D l=00005F6E g=00005F6E t=00000001 c=0000000F 68->68 [ R_OPCODE R_MULTIEXEC ]
00401038 p=00401033 l=00005F6F g=00005F6F t=00000001 c=0000000F E8->E8 [ R_OPCODE R_MULTIEXEC ]
0040103D p=77E84B7F l=00005FAE g=0000611D t=00000001 c=0000000E EB->EB [ R_OPCODE R_MULTIEXEC ]
00401046 p=77E92CA5 l=00000025 g=00005FC7 t=00000002 c=00000001 55->55 [ R_OPCODE ]
00401047 p=00401046 l=00000026 g=00005FC8 t=00000002 c=00000001 E8->E8 [ R_OPCODE ]
0040104C p=77F92536 l=0000006A g=0000600C t=00000002 c=00000002 8B->8B [ R_OPCODE R_SEHHANDLER ]
00401050 p=0040104C l=0000006C g=0000600E t=00000002 c=00000001 EB->EB [ R_OPCODE ]
00401052 p=00401047 l=00000027 g=00005FC9 t=00000002 c=00000001 64->64 [ R_OPCODE ]
00401058 p=00401052 l=00000028 g=00005FCA t=00000002 c=00000001 64->64 [ R_OPCODE ]
0040105E p=00401058 l=00000029 g=00005FCB t=00000002 c=00000001 E8->E8 [ R_OPCODE ]
00401063 p=00401071 l=0000002C g=00005FCE t=00000002 c=00000001 33->33 [ R_OPCODE ]
00401065 p=00401063 l=0000002D g=00005FCF t=00000002 c=00000002 F7->F7 [ R_OPCODE R_EXCEPTION ]
00401067 p=00401050 l=0000006D g=0000600F t=00000002 c=00000001 64->64 [ R_OPCODE ]
0040106D p=00401067 l=0000006E g=00006010 t=00000002 c=00000001 58->58 [ R_OPCODE ]
0040106E p=0040106D l=0000006F g=00006011 t=00000002 c=00000001 5D->5D [ R_OPCODE ]
0040106F p=0040106E l=00000070 g=00006012 t=00000002 c=00000001 C3->C3 [ R_OPCODE ]
00401070 p=0040105E l=00005E04 g=00005E04 t=FFFFFFFF c=00000002 90->90 [ R_OPCODE R_MULTITHREAD R_MULTIEXEC ]
00401071 p=00401070 l=00005E05 g=00005E05 t=FFFFFFFF c=00000002 C3->C3 [ R_OPCODE R_MULTITHREAD R_MULTIEXEC ]
00401078 p=00401038 l=00005F70 g=00005F70 t=00000001 c=0000000F FF->FF [ R_OPCODE R_MULTIEXEC ]
0040107E p=0040102D l=00005E0E g=00005E0E t=00000001 c=00000001 FF->FF [ R_OPCODE ]
00403038 p=00000000 l=00000000 g=00000000 t=00000000 c=00000000 56->BB [ R_VARIABLE ]
00403039 p=00000000 l=00000000 g=00000000 t=00000000 c=00000000 30->B0 [ R_VARIABLE ]
...
00403042 p=00000000 l=00000000 g=00000000 t=00000000 c=00000000 00->E9 [ R_VARIABLE ]
00403043 p=00000000 l=00000000 g=00000000 t=00000000 c=00000000 00->77 [ R_VARIABLE ]

                                 SYNTHESIS
                                 ---------

     Combining  disassembling  and  tracing,  it is possible to obtain more
 info  about  program,  such  as  modifying  variables,  blocks  of  unused
 instructions, or even execution graph, including threads and DLLs.

     Moreover.   Tracing  allows  us  to  get  info  not  only  about  main
 executable,  but  also  about  DLLs  it uses, and, as such, to disassemble
 these  DLLs  after  tracing, to be able to modify both main file and these
 DLLs  in  a  single  context. This means that we have an ability to insert
 parts  of  virus  into different files, EXE and/or DLL's it uses, but with
 known order of execution.

                                   * * *
